14 research outputs found

Linear Systems over Join-Blank Algebras

Author: Jananthan Hayden
Kepner Jeremy
Kim Suna
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/10/2017

A central problem of linear algebra is solving linear systems. Regarding linear systems as equations over general semirings $(V,\otimes,\oplus,0,1)$ instead of rings or fields makes traditional approaches impossible. Earlier work shows that the solution space $X(A;w)$ of the linear system $Av = w$ over the class of semirings called join-blank algebras is a union of closed intervals (in the product order) with a common terminal point. In the smaller class of max-blank algebras, the additional hypothesis that the solution spaces of the $1 \times 1$ systems $Av = w$ are closed intervals implies that $X(A;w)$ is a finite union of closed intervals. We examine the general case, proving that without this additional hypothesis, we can still make $X(A;w)$ into a finite union of quasi-intervals.
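As a small illustration of a solution space that is a union of closed intervals sharing a terminal point, the sketch below enumerates X(A;w) by brute force. It uses the max-min semiring on the chain {0, 1, 2, 3} (⊕ = max, ⊗ = min) as a stand-in; that this semiring satisfies the paper's join-blank axioms is an assumption made here, and the matrix, right-hand side, and function names are hypothetical.

```python
from itertools import product

def matvec(A, v):
    # (A v)_i = max_j min(A[i][j], v[j]); here oplus = max and otimes = min
    return tuple(max(min(a, x) for a, x in zip(row, v)) for row in A)

def solution_space(A, w, values):
    # Brute-force enumeration of all v with A v = w (only viable for tiny cases)
    n = len(A[0])
    return [v for v in product(values, repeat=n) if matvec(A, v) == w]

def leq(u, v):
    # Product order: u <= v componentwise
    return all(a <= b for a, b in zip(u, v))

V = range(4)          # the chain {0, 1, 2, 3}, an assumption for this sketch
A = [[2, 2]]          # hypothetical 1 x 2 system
w = (2,)

X = solution_space(A, w, V)

# Common terminal point: the componentwise maximum of all solutions
top = tuple(max(v[j] for v in X) for j in range(len(A[0])))

# Minimal solutions: the initial points of the closed intervals [m, top]
minimal = [u for u in X if not any(leq(v, u) and v != u for v in X)]

print("solutions:", len(X))
print("terminal point:", top)
print("initial points:", minimal)
```

In this instance every solution lies in one of the intervals from a minimal solution up to the shared terminal point, which the brute-force enumeration makes easy to check.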

arXiv.org e-Print Archive

Caltech Authors

Constructing Adjacency Arrays from Incidence Arrays

Author: Dibert Karia
Jananthan Hayden
Kepner Jeremy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/02/2017

Graph construction, a fundamental operation in a data processing pipeline, is typically done by multiplying the incidence array representations of a graph, $\mathbf{E}_\mathrm{in}$ and $\mathbf{E}_\mathrm{out}$, to produce an adjacency array of the graph, $\mathbf{A}$, that can be processed with a variety of algorithms. This paper provides the mathematical criteria to determine if the product $\mathbf{A} = \mathbf{E}^{\sf T}_\mathrm{out}\mathbf{E}_\mathrm{in}$ will have the required structure of the adjacency array of the graph. The values in the resulting adjacency array are determined by the corresponding addition $\oplus$ and multiplication $\otimes$ operations used to perform the array multiplication. Illustrations of the various results possible from different $\oplus$ and $\otimes$ operations are provided using a small collection of popular music metadata. Comment: 8 pages, 5 figures, accepted to IEEE IPDPS 2017 Workshop on Graph Algorithm Building Block
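The effect of the choice of ⊕ and ⊗ on the adjacency array can be sketched in a few lines. The example below stores incidence arrays as dictionaries keyed by (edge, vertex) and computes A(i, j) as the ⊕-reduction over edges k of E_out(k, i) ⊗ E_in(k, j). The edge and vertex labels and the weights are hypothetical, and the sketch sidesteps the paper's structural criteria by storing only nonzero entries.

```python
from functools import reduce
import operator

def adjacency(E_out, E_in, add, mul):
    """Compute A = E_out^T E_in for sparse arrays stored as {(edge, vertex): value}.

    A[i, j] is the add-reduction, over edges k present in both arrays,
    of mul(E_out[k, i], E_in[k, j]).
    """
    terms = {}
    for (k, i), x in E_out.items():
        for (k2, j), y in E_in.items():
            if k == k2:
                terms.setdefault((i, j), []).append(mul(x, y))
    return {key: reduce(add, vals) for key, vals in terms.items()}

# Hypothetical graph: edges e1 and e2 both go a -> b, edge e3 goes b -> c
E_out = {("e1", "a"): 1, ("e2", "a"): 2, ("e3", "b"): 3}
E_in = {("e1", "b"): 1, ("e2", "b"): 5, ("e3", "c"): 1}

plus_times = adjacency(E_out, E_in, operator.add, operator.mul)
max_min = adjacency(E_out, E_in, max, min)

print(plus_times)  # {('a', 'b'): 11, ('b', 'c'): 3}
print(max_min)     # {('a', 'b'): 2, ('b', 'c'): 1}
```

Both results have the same nonzero pattern, so both describe the same graph, but the stored values differ: plus-times accumulates products over parallel edges, while max-min keeps the largest of the per-edge minima.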

arXiv.org e-Print Archive

Crossref

Polystore mathematics of relational algebra

Author: Gadepally Vijay
Hutchison Dylan
Jananthan Hayden
Kepner Jeremy
Kim Suna
Zhou Ziqi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2017

Financial transactions, internet search, and data analysis are all placing increasing demands on databases. SQL, NoSQL, and NewSQL databases have been developed to meet these demands and each offers unique benefits. SQL, NoSQL, and NewSQL databases also rely on different underlying mathematical models. Polystores seek to provide a mechanism to allow applications to transparently achieve the benefits of diverse databases while insulating applications from the details of these databases. Integrating the underlying mathematics of these diverse databases can be an important enabler for polystores as it enables effective reasoning across different databases. Associative arrays provide a common approach for the mathematics of polystores by encompassing the mathematics found in different databases: sets (SQL), graphs (NoSQL), and matrices (NewSQL). Prior work presented the SQL relational model in terms of associative arrays and identified key mathematical properties that are preserved within SQL. This work provides the rigorous mathematical definitions, lemmas, and theorems underlying these properties. Specifically, SQL Relational Algebra deals primarily with relations - multisets of tuples - and operations on and between those relations. These relations can be modeled as associative arrays by treating tuples as non-zero rows in an array. Operations in relational algebra are built as compositions of standard operations on associative arrays which mirror their matrix counterparts. These constructions provide insight into how relational algebra can be handled via array operations. As an example application, the composition of two projection operations is shown to also be a projection, and the projection of a union is shown to be equal to the union of the projections
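The two example properties stated at the end of the abstract can be checked on a toy case. The sketch below models a relation as a plain multiset of tuples using `collections.Counter`, rather than the associative-array construction the paper develops; the attribute names, rows, and helper functions are hypothetical, and union is taken to be multiset union (multiplicities add).

```python
from collections import Counter

def relation(rows):
    """Multiset of tuples; each tuple is a frozenset of (attribute, value) pairs."""
    return Counter(frozenset(r.items()) for r in rows)

def project(rel, attrs):
    """Keep only the given attributes; multiplicities of merged tuples add up."""
    out = Counter()
    for row, count in rel.items():
        out[frozenset((a, v) for a, v in row if a in attrs)] += count
    return out

def union(r, s):
    """Multiset union: multiplicities add."""
    return r + s

R = relation([{"id": 1, "city": "Boston", "year": 2017},
              {"id": 2, "city": "Boston", "year": 2017}])
S = relation([{"id": 3, "city": "Austin", "year": 2016},
              {"id": 4, "city": "Boston", "year": 2017}])

attrs = {"city", "year"}

# Projection of a union equals the union of the projections
lhs = project(union(R, S), attrs)
rhs = union(project(R, attrs), project(S, attrs))
print(lhs == rhs)

# Composing two projections gives the projection onto the shared attributes
composed = project(project(R, {"id", "city"}), {"city", "year"})
direct = project(R, {"city"})
print(composed == direct)
```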

arXiv.org e-Print Archive

Caltech Authors

Large Scale Enrichment and Statistical Cyber Characterization of Network Traffic

Author: Buluç Aydın
Davis Tim
Elsakkary Youssef
Estrada Arminda
Grant Daniel
Jananthan Hayden
Jones Michael
Kawaminami Ivan
Kepner Jeremy
Meiners Chad
Morris Andrew
Pisharody Sandeep
Publication date: 07/09/2022

Modern network sensors continuously produce enormous quantities of raw data that are beyond the capacity of human analysts. Cross-correlation of network sensors increases this challenge by enriching every network event with additional metadata. These large volumes of enriched network data present opportunities to statistically characterize network traffic and quickly answer a key question: "What are the primary cyber characteristics of my network data?" The Python GraphBLAS and PyD4M analysis frameworks enable anonymized statistical analysis to be performed quickly and efficiently on very large network data sets. This approach is tested using billions of anonymized network data samples from the largest Internet observatory (CAIDA Telescope) and tens of millions of anonymized records from the largest commercially available background enrichment capability (GreyNoise). The analysis confirms that most of the enriched variables follow expected heavy-tail distributions and that a large fraction of the network traffic is due to a small number of cyber activities. This information can simplify the cyber analysts' task by enabling prioritization of cyber activities based on statistical prevalence. Comment: 8 pages, 8 figures, HPE

arXiv.org e-Print Archive

The University of Arizona

pPython Performance Study

Author: Arcand William
Bergeron Bill
Bestor David
Byun Chansup
Gadepally Vijay
Houle Michael
Hubbell Matthew
Jananthan Hayden
Jones Michael
Kepner Jeremy
Klein Anna
Michaleas Peter
Milechin Lauren
Morales Guillermo
Mullen Julie
Prout Andrew
Reuther Albert
Rosa Antonio
Samsi Siddharth
Yee Charles
Publication date: 07/09/2023

pPython seeks to provide a parallel capability that delivers good speed-up without sacrificing the ease of programming in Python by implementing partitioned global array semantics (PGAS) on top of a simple file-based messaging library (PythonMPI) in pure Python. pPython follows an SPMD (single program, multiple data) model of computation. pPython runs on a single node (e.g., a laptop) running Windows, Linux, or macOS, or on any combination of heterogeneous systems that support Python, including a cluster through a Slurm scheduler interface, so that pPython can be executed in a massively parallel computing environment. Because of its unique file-based messaging implementation, it is interesting to see what performance pPython can achieve compared to traditional socket-based MPI communication. In this paper, we present the point-to-point and collective communication performance of pPython and compare it with that obtained using mpi4py with OpenMPI. For large messages, pPython demonstrates performance comparable to mpi4py. Comment: arXiv admin note: substantial text overlap with arXiv:2208.1490
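To make the idea of file-based point-to-point messaging concrete, here is a minimal sketch using only the Python standard library. It is not the PythonMPI API: the function names, the file naming scheme, and the polling strategy are all assumptions chosen for illustration. The sender writes a pickled object to a temporary file and renames it into place, and the receiver polls a shared directory until the file appears.

```python
import os
import pickle
import tempfile
import time

def _message_path(comm_dir, src, dest, tag):
    # Hypothetical naming scheme: one file per (source, destination, tag)
    return os.path.join(comm_dir, f"msg_{src}_to_{dest}_tag{tag}.pkl")

def send(comm_dir, src, dest, tag, obj):
    """Write obj to a temporary file, then rename it into place.

    The rename means the receiver never observes a partially written file.
    """
    fd, tmp_path = tempfile.mkstemp(dir=comm_dir)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(obj, f)
    os.replace(tmp_path, _message_path(comm_dir, src, dest, tag))

def recv(comm_dir, src, dest, tag, poll=0.01, timeout=5.0):
    """Poll the shared directory until the message file exists, then load it."""
    path = _message_path(comm_dir, src, dest, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message at {path}")
        time.sleep(poll)
    with open(path, "rb") as f:
        obj = pickle.load(f)
    os.remove(path)  # consume the message
    return obj

# Single-process demonstration; real ranks would be separate processes
# sharing comm_dir through a common file system.
payload = {"rank": 0, "data": [1, 2, 3]}
with tempfile.TemporaryDirectory() as comm_dir:
    send(comm_dir, src=0, dest=1, tag=7, obj=payload)
    received = recv(comm_dir, src=0, dest=1, tag=7)
    leftover = os.listdir(comm_dir)

print(received)
```

A design like this needs nothing beyond a shared directory, which fits the heterogeneous systems the abstract describes, at the cost of file-system latency on each message.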

arXiv.org e-Print Archive

Deployment of Real-Time Network Traffic Analysis using GraphBLAS Hypersparse Matrices and D4M Associative Arrays

Author: Arcand William
Bergeron William
Bestor David
Byun Chansup
Davis Timothy
Gadepally Vijay
Houle Michael
Hubbell Matthew
Jananthan Hayden
Jones Michael
Kepner Jeremy
Klein Anna
Michaleas Peter
Milechin Lauren
Morales Guillermo
Mullen Julie
Patel Ritesh
Pisharody Sandeep
Prout Andrew
Reuther Albert
Rosa Antonio
Samsi Siddharth
Yee Charles
Publication date: 04/09/2023

Matrix/array analysis of networks can provide significant insight into their behavior and aid in their operation and protection. Prior work has demonstrated the analytic, performance, and compression capabilities of GraphBLAS (graphblas.org) hypersparse matrices and D4M (d4m.mit.edu) associative arrays (a mathematical superset of matrices). Obtaining the benefits of these capabilities requires integrating them into operational systems, which comes with its own unique challenges. This paper describes two examples of real-time operational implementations. The first is an operational GraphBLAS implementation that constructs anonymized hypersparse matrices on a high-bandwidth network tap. The second is an operational D4M implementation that analyzes daily cloud gateway logs. The architectures of these implementations are presented. Detailed measurements of the resources and the performance are collected and analyzed. The implementations are capable of meeting their operational requirements using modest computational resources (a couple of processing cores). GraphBLAS is well-suited for low-level analysis of high-bandwidth connections with relatively structured network data. D4M is well-suited for higher-level analysis of more unstructured data. This work demonstrates that these technologies can be implemented in operational settings. Comment: Accepted to IEEE HPEC, 8 pages, 8 figures, 1 table, 69 references. arXiv admin note: text overlap with arXiv:2203.13934. text overlap with arXiv:2309.0180
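The notion of an anonymized hypersparse traffic matrix can be sketched without any GraphBLAS dependency. The example below is not the GraphBLAS or D4M API: it stores only the nonzero entries in a `Counter` keyed by (source, destination), and anonymizes addresses with a salted SHA-256 hash, which is an assumption made here and not necessarily the scheme used in the paper. The addresses, salt, and function names are hypothetical.

```python
import hashlib
from collections import Counter

def anonymize(address, salt):
    # Salted hash as a stand-in anonymization scheme (an assumption)
    return hashlib.sha256((salt + address).encode()).hexdigest()

def traffic_matrix(packets, salt):
    """Count packets per (source, destination) pair, storing nonzeros only."""
    A = Counter()
    for src, dst in packets:
        A[(anonymize(src, salt), anonymize(dst, salt))] += 1
    return A

# Hypothetical packet headers as (source, destination) pairs
packets = [("10.0.0.1", "10.0.0.2"),
           ("10.0.0.1", "10.0.0.2"),
           ("10.0.0.1", "10.0.0.3"),
           ("10.0.0.4", "10.0.0.2")]

A = traffic_matrix(packets, salt="hypothetical-salt")

total_packets = sum(A.values())             # sum of all entries
unique_links = len(A)                       # number of nonzero entries
unique_sources = len({i for i, _ in A})     # number of nonempty rows
fan_out = Counter(i for i, _ in A)          # destinations per source
max_fan_out = max(fan_out.values())

print(total_packets, unique_links, unique_sources, max_fan_out)
```

Because the row and column keys are drawn from an enormous index space while only a handful of entries are stored, the structure is hypersparse in the sense that most rows and columns are entirely empty.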

arXiv.org e-Print Archive